Unify multiplier-bootstrap RNG: drop Rust weight helper, numpy canonical #362
Closes the silent-failures audit follow-up for the Rust multiplier-weight RNG.

`rust/src/bootstrap.rs::generate_bootstrap_weights_batch` seeded `Xoshiro256PlusPlus::seed_from_u64(seed + i)` per row while `diff_diff.bootstrap_utils.generate_bootstrap_weights_batch_numpy` ran under `numpy.random.default_rng` (PCG64), so the same `seed` produced different Rademacher/Mammen/Webb weights depending on whether the Rust backend was compiled in, and therefore different multiplier-bootstrap SE / CI / p-values in CallawaySantAnna, ContinuousDiD, ImputationDiD, TwoStageDiD, ChaisemartinDHaultfoeuille, and EfficientDiD.

Unlike the TROP bootstrap fix (PR #354), this Rust helper did nothing after generating weights, so threading pre-generated weights through PyO3 would have left the function as a pointless pass-through. Deleted the `rust/src/bootstrap.rs` module entirely together with the orphaned `rand` + `rand_xoshiro` Cargo deps; the Python shim now calls the numpy implementation directly (with `_numpy` kept as an alias for backward compatibility).

The three distributional property tests move to `tests/test_bootstrap_utils.py::TestBootstrapWeightsBatchDistributions`, including byte-level seed-pinned baselines for each distribution. `test_rust_bootstrap_weights_batch_is_removed` mirrors the existing `test_compute_synthetic_weights_is_removed` pattern as a regression guard against accidental re-export.

Users running with the Rust backend compiled in will see different-but-equally-valid bootstrap SE / CI / p-values under the same `seed` (PCG64 replaces Xoshiro); pure-Python users are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
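For readers unfamiliar with the three multiplier distributions named above, here is a minimal sketch of how such weights can be drawn from a numpy `Generator`. The function name `draw_multiplier_weights` and its signature are hypothetical and are not taken from `diff_diff`; the point masses are the standard textbook definitions of the Rademacher, Mammen two-point, and Webb six-point distributions, each with mean 0 and variance 1.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)


def draw_multiplier_weights(rng, n_bootstrap, n_units, kind="rademacher"):
    """Hypothetical helper: return an (n_bootstrap, n_units) weight matrix."""
    shape = (n_bootstrap, n_units)
    if kind == "rademacher":
        # -1 or +1, each with probability 1/2.
        return rng.choice(np.array([-1.0, 1.0]), size=shape)
    if kind == "mammen":
        # Two-point distribution: the low value is drawn with probability p_low.
        low = -(SQRT5 - 1.0) / 2.0
        high = (SQRT5 + 1.0) / 2.0
        p_low = (SQRT5 + 1.0) / (2.0 * SQRT5)
        return np.where(rng.random(shape) < p_low, low, high)
    if kind == "webb":
        # Six equally likely points, symmetric around zero.
        points = np.array(
            [-np.sqrt(1.5), -1.0, -np.sqrt(0.5), np.sqrt(0.5), 1.0, np.sqrt(1.5)]
        )
        return rng.choice(points, size=shape)
    raise ValueError(f"unknown weight kind: {kind!r}")


# Example: 999 bootstrap rows for 50 units, seeded the way library callers do.
weights = draw_multiplier_weights(np.random.default_rng(42), 999, 50, kind="webb")
print(weights.shape, weights.mean(), weights.var())
```

Because every draw goes through the supplied `rng`, the resulting matrix depends only on the generator's engine and state, which is why two backends with different engines cannot agree under a shared integer seed.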
Overall Assessment ✅ Looks good. No unmitigated P0/P1 findings. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
The helper accepts any ``np.random.Generator``, not specifically PCG64; library callers happen to seed via ``np.random.default_rng``. Clarifies that backend invariance is with respect to the supplied generator state, not a specific engine. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
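The distinction between "same engine and state" and "same integer seed" can be illustrated with a small sketch. `rademacher_batch` is a hypothetical stand-in for the weight helper, not the library function; `PCG64` and `Philox` are real numpy bit generators, and Philox is used here only as an arbitrary second engine.

```python
import numpy as np


def rademacher_batch(rng: np.random.Generator, n_bootstrap: int, n_units: int) -> np.ndarray:
    """Hypothetical stand-in: draw +/-1 weights from whatever Generator is supplied."""
    return rng.choice(np.array([-1.0, 1.0]), size=(n_bootstrap, n_units))


seed = 42

# default_rng(seed) wraps PCG64, so these two generators start in the same state.
from_default = rademacher_batch(np.random.default_rng(seed), 200, 30)
from_pcg64 = rademacher_batch(np.random.Generator(np.random.PCG64(seed)), 200, 30)

# Same integer seed, different engine: a different stream of weights.
from_philox = rademacher_batch(np.random.Generator(np.random.Philox(seed)), 200, 30)

pcg_matches = np.array_equal(from_default, from_pcg64)
philox_matches = np.array_equal(from_default, from_philox)
print(pcg_matches, philox_matches)
```

All three matrices are valid Rademacher draws; only the first two are reproducible against each other.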
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good — no unmitigated P0/P1 findings. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
…value The PR's RNG unification is working as intended — Rust now produces the same bootstrap SE as pure-Python under the same seed — but the pinned `TestBootstrapCellPeriod::test_bootstrap_se_matches_pre_pr4_baseline` guard held two separate per-backend baselines from the pre-RNG-unified era. The Rust backend now produces exactly the value that was formerly pinned as the Python-only baseline (0.3030802540369796), so the split is obsolete. Collapse the two constants into one, drop the HAS_RUST_BACKEND branch + os env-var check, and document the convergence in the comment. The regression guard's semantic is unchanged: under PSU=group the dispatcher still routes through the legacy group-level bootstrap path, and the assertion is still ULP-precision bit-identity against the captured baseline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
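As an illustration of what a bit-identity guard of this kind checks, the sketch below compares two floats by their IEEE 754 bit patterns rather than with a tolerance. `toy_bootstrap_se` is a hypothetical toy statistic, not the estimator under test, and the baseline is captured at runtime here only so the example is self-contained; a real guard would hard-code the captured constant.

```python
import struct

import numpy as np


def float_bits(x: float) -> int:
    """Return the IEEE 754 binary64 bit pattern of x as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def toy_bootstrap_se(seed: int, n_bootstrap: int = 199, n_units: int = 40) -> float:
    """Hypothetical toy: standard deviation of Rademacher-weighted means."""
    rng = np.random.default_rng(seed)
    influence = np.linspace(-1.0, 1.0, n_units)
    weights = rng.choice(np.array([-1.0, 1.0]), size=(n_bootstrap, n_units))
    draws = (weights * influence).sum(axis=1) / n_units
    return float(np.std(draws, ddof=1))


baseline = toy_bootstrap_se(seed=42)  # stands in for a pinned constant
rerun = toy_bootstrap_se(seed=42)

# Exact comparison: any change in the RNG stream or arithmetic order would flip this.
is_bit_identical = float_bits(rerun) == float_bits(baseline)
print(is_bit_identical)
```

A single shared constant is enough for such a guard once every backend consumes the same generator stream, which is the convergence the commit message describes.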
/ai-review
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good — no unmitigated P0/P1 findings. Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Closing without merging. On reflection, the correctness argument is really about cross-backend seed parity, not correctness per se — each backend independently produces valid draws from the specified multiplier distribution. The measured weight-generation slowdown is meaningful (numpy PCG64 vs Rust Xoshiro+rayon):
At typical practitioner scales ( […]
If a future session revisits this, the more promising path is a Rust helper that accepts Python-generated uniform bytes (not weights) and does the bucket/sign transform in parallel, which preserves backend invariance without giving up rayon throughput.
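The "uniform bytes in, weights out" idea from the closing comment can be sketched in pure Python to show why it would preserve backend invariance: the transform is a deterministic function of the bytes, so any backend given the same bytes must return the same weights. The function names below are hypothetical, and nothing here is the proposed Rust code; `Generator.bytes`, `np.frombuffer`, and `np.unpackbits` are existing numpy calls.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)


def rademacher_from_bytes(raw: bytes, n_bootstrap: int, n_units: int) -> np.ndarray:
    """Sign transform: one random bit per weight, 0 -> -1.0 and 1 -> +1.0."""
    n = n_bootstrap * n_units
    bits = np.unpackbits(np.frombuffer(raw, dtype=np.uint8))[:n]
    if bits.size < n:
        raise ValueError("not enough random bytes for the requested shape")
    return (2.0 * bits - 1.0).reshape(n_bootstrap, n_units)


def mammen_from_bytes(raw: bytes, n_bootstrap: int, n_units: int) -> np.ndarray:
    """Bucket transform: one little-endian uint32 per weight, thresholded at p_low."""
    n = n_bootstrap * n_units
    u = np.frombuffer(raw, dtype="<u4")[:n]
    if u.size < n:
        raise ValueError("not enough random bytes for the requested shape")
    p_low = (SQRT5 + 1.0) / (2.0 * SQRT5)
    threshold = np.uint32(int(p_low * 2**32))
    low, high = -(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / 2.0
    return np.where(u < threshold, low, high).reshape(n_bootstrap, n_units)


# Python owns the RNG; the transform only ever sees bytes.
rng = np.random.default_rng(42)
n_bootstrap, n_units = 4, 8
sign_weights = rademacher_from_bytes(rng.bytes((n_bootstrap * n_units + 7) // 8), n_bootstrap, n_units)
mammen_weights = mammen_from_bytes(rng.bytes(4 * n_bootstrap * n_units), n_bootstrap, n_units)
print(sign_weights.shape, mammen_weights.shape)
```

In this sketch the per-element transform has no cross-element dependencies, which is what would make it a candidate for parallel execution in a compiled backend.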
Summary
- `rust/src/bootstrap.rs::generate_bootstrap_weights_batch` seeded `Xoshiro256PlusPlus::seed_from_u64(seed + i)` per row while `diff_diff.bootstrap_utils.generate_bootstrap_weights_batch_numpy` consumed `numpy.random.default_rng` (PCG64). Same `seed` → different Rademacher/Mammen/Webb weights depending on whether the Rust backend was compiled in, and therefore different bootstrap SE / CI / p-values in CallawaySantAnna, ContinuousDiD, ImputationDiD, TwoStageDiD, ChaisemartinDHaultfoeuille (multi-horizon PSU + group-level + PSU-broadcast multiplier bootstrap), and EfficientDiD (cluster + unit multiplier bootstrap). No warning fired.
- […] `rand` + `rand_xoshiro` Cargo deps) and collapse the Python shim to call the existing numpy implementation directly. Unlike the TROP bootstrap fix (PR #354) the Rust helper had no post-RNG work, so threading pre-generated weights through PyO3 would have left a pointless pass-through, the same shape as the `compute_synthetic_weights` delete-and-inline from PRs #344/#345.
- […] `stratified_bootstrap_indices`. Neither imports the deleted helper.
- […] `seed` after this change (numpy PCG64 replaces Rust Xoshiro). Pure-Python-backend users are unaffected. Sampling distributions are unchanged.
- […] `TestBootstrapWeightsBatchDistributions` class in `tests/test_bootstrap_utils.py`: 11 tests including byte-level `default_rng(42)` seed pins for each of Rademacher, Mammen, Webb. New `test_rust_bootstrap_weights_batch_is_removed` regression guard in `tests/test_rust_backend.py` mirrors the existing `test_compute_synthetic_weights_is_removed` pattern.

Net diff: +164 / -434 across 10 files. Closes the silent-failures audit TODO row for the Rust multiplier-weight RNG.
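A removal regression guard of the kind mentioned in the summary can be sketched as follows. The module `stub_rust_backend` is a hypothetical stand-in registered in `sys.modules` so the example runs without the real extension; the actual test in `tests/test_rust_backend.py` may be written differently.

```python
import sys
import types

# Hypothetical stand-in for a compiled backend module that no longer
# exports the deleted helper but still exports something else.
stub = types.ModuleType("stub_rust_backend")
stub.remaining_helper = lambda: "still exported"
sys.modules["stub_rust_backend"] = stub


def symbol_is_removed() -> bool:
    """Return True if importing the deleted helper raises ImportError."""
    try:
        from stub_rust_backend import generate_bootstrap_weights_batch  # noqa: F401
    except ImportError:
        return True
    return False


print(symbol_is_removed())
```

The guard fails as soon as the name reappears on the module, which is what protects against an accidental re-export.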
Test plan
- `maturin develop --release` builds cleanly with `rand`/`rand_xoshiro` dropped from `rust/Cargo.toml`.
- `from diff_diff._rust_backend import generate_bootstrap_weights_batch` raises `ImportError`.
- `tests/test_bootstrap_utils.py` green (36 tests including the new 11-test distributional class with seed-pinned baselines).
- `tests/test_rust_backend.py` green (86 tests including both removal regression guards).
- […] `test_staggered.py`, `test_continuous_did.py`, `test_imputation.py`, `test_two_stage.py`, `test_efficient_did.py`, `test_chaisemartin_dhaultfoeuille.py`.
- […] `DIFF_DIFF_BACKEND=python`.

🤖 Generated with Claude Code